ABSTRACT

The criterion problem in the neoPiagetian
cognitive-developmental literature is reviewed. This problem has
provoked debate in the context of three empirical questions: concept
age norms, concept training effects, and concept invariant sequences.
It is argued that only the question of invariant sequences in
same-stage concepts is theoretically crucial. With regard to this
question, it is shown that both Type I ("false positive") and Type II
("false negative") criterion errors tend to produce data which are
spuriously supportive of theoretical predictions. It is concluded
that the objective psychometric consequences of criterion errors
provide no grounds for the current view that obtaining or not
obtaining theory-supportive findings in concept development research
turns on arbitrary decisions about response criteria. (Author/CS)

This paper is concerned with a debate over the response criteria used
to infer the presence of Piagetian stage-related concepts, a debate which has
been going on in the cognitive-developmental literature for more than a decade.
The debate began during the early 1960s with a series of exchanges between
Braine (1962, 1964) and Smedslund (1963, 1965) over transitivity criteria.
Shortly thereafter, the debate was generalized to the measurement of other
Piagetian concepts in papers by Gruen (1966) and Smedslund (1969). The
present decade has witnessed a continuation of the original disagreement
over transitivity criteria in the form of an exchange between Trabasso and
his associates (Bryant, 1973; Bryant & Trabasso, 1971; Riley & Trabasso,
1974) and Youniss and Furth (1973). Further discussions of the general
version of the criterion problem, which includes the transitivity
disagreement as a special case, also have appeared recently (Brainerd, 1973a,
1973b, 1973c, 1974a; Brainerd & Hooper, 1974; Kuhn, 1971; Reese & Schack, 1974).

In its general form, the criterion problem is concerned with the
minimum behavioral evidence an investigator requires before he will conclude
that a given subject possesses some stage-related concept. On the one hand,
the Genevans (e.g., Inhelder, Bovet, Sinclair, & Smock; Inhelder & Sinclair,
1969; Sinclair, 1973) and investigators who favor a more or less orthodox
Piagetian view of cognitive development (e.g., Lasry & Laurendeau, 1969;
Smedslund, 1963; Strauss, 1972; Youniss & Furth, 1973) require very strong
evidence. On the other hand, less orthodox investigators (e.g., Braine,
1959; Braine & Shanks, 1965a, 1965b; Bruner, 1964; Bryant & Trabasso, 1971;
Mehler & Bever, 1967) will accept somewhat weaker evidence. For reasons that
have never been explicitly stated, investigators from the former group seem
to view Type I assessment errors ("false positive" diagnoses) as far more
heinous than Type II assessment errors ("false negative" diagnoses); hence,
their conservative criteria. In contrast, investigators from the latter
group tend to regard Type I and Type II errors as equally objectionable;
hence, their more liberal criteria.

To illustrate some of the extant differences between "conservative
criteria" and "liberal criteria," consider the ubiquitous liquid quantity
conservation problem. We begin with two identical glasses containing equal
amounts of water and then pour the contents of one glass into either a wider
or narrower container. If this problem is administered according to the
usual conservative criterion (e.g., Inhelder & Sinclair, 1969), subjects
would be required to do at least three things: (a) judge whether or not
the two quantities are still equal; (b) explain each judgment; (c) reply
to a countersuggestion by the experimenter about each judgment. Any given
administration of the problem is scored as a "pass" only if the subject
judges that the quantities are still equal, provides an explanation which
indicates that he grasps the underlying logic of the problem, and resists
the experimenter's suggestion that the two quantities are in fact unequal.
If the problem is administered according to the usual liberal criterion
(e.g., Brainerd, 1974b), subjects would be required only to judge
whether or not the two quantities are still equal after transformation.
Any given administration of the problem is scored as a "pass" if the subject
makes an equivalence judgment. The subject is said to possess conservation
if the ratio of equivalence judgments to difference judgments departs
significantly from chance expectations across a series of administrations
of the problem.

The conservative criteria employed by orthodox investigators are open
to the criticism that many subjects who possess the concepts being assessed
will not be able to meet the stringent criteria. In the preceding
illustration, for example, component b is suspect. I have shown elsewhere on
theoretical grounds that explanations introduce at least two specific sources
of Type II error (Brainerd, 1973b, 1974a). Moreover, Siegel (1974a, 1974b)
recently has shown that children's capacity to use the language tapped by
component b lags far behind their grasp of the concepts themselves. On the
other hand, the more liberal criteria employed by less orthodox investigators
are open to the criticism that some subjects who do not possess the concept
being assessed will be able to satisfy the criteria. In the preceding
illustration, for example, it is theoretically possible for a subject to
adopt an equivalence response set and pass every trial without understanding
conservation. More generally, Smedslund (1969) has argued that most Piagetian
concept assessments are susceptible to "irrelevant hypotheses" (e.g.,
equivalence sets in the case of conservation) which allow subjects to generate
correct judgments without possessing the concept. Although at present
there does not appear to be any unequivocal empirical support for the
existence and use of irrelevant hypotheses (cf. Brainerd, 1973b, Footnote 3,
p. 176), many investigators have viewed Smedslund's argument as logically
persuasive (e.g., Reese & Schack, 1974).

In short, researchers who study Piagetian concepts and seek to test
predictions from the theory are confronted with the following dilemma:
Is it better to adopt conservative Genevan criteria and run the risk of
committing Type II errors or is it better to adopt more liberal criteria
and run the risk of committing Type I errors? Does a criterion which may
incorporate Type II error or a criterion which may incorporate Type I error
provide a fairer test of the theory? To answer this question, we must know
what the objective consequences of Type I and Type II errors are for the
theory. We shall consider this problem below. First, the empirical
significance of the criterion problem will be discussed as a means of narrowing
its scope and sharpening its focus. We shall see that the criterion problem
is of critical theoretical significance only in studies concerned with the
order of emergence of two or more same-stage concepts. Next, the effects
of Type I and Type II errors on expected frequencies for samples drawn from
populations which do and do not support theoretical predictions will be
reviewed. Contrary to the frequently acknowledged opinion that liberal criteria
produce data which are less favorable to the Piagetian theory than conservative
Genevan criteria (e.g., Gruen, 1966; Strauss, 1972), we shall see that
Type I and Type II errors both favor the theory.

Empirical Significance of the Criterion Problem
In the cognitive-developmental literature, the criterion problem has
been a source of controversy in the context of three specific questions: (a)
the age of emergence of individual Piagetian concepts; (b) the trainability
of individual Piagetian concepts; (c) the order of emergence of two or more
same-stage Piagetian concepts. Concerning a, the age norms for concepts
such as transitivity and conservation are two or three years older with
Genevan criteria than with more liberal criteria (e.g., Braine, 1959;
Braine & Shanks, 1965a, 1965b; Brainerd, 1973a; Bryant & Trabasso, 1971).
Concerning b, it has been suggested (Gruen, 1966; Kuhn, 1974; Strauss, 1972)
that training effects observed with more liberal criteria may not be observed
with Genevan criteria. Concerning c, two concepts belonging to the same
Piagetian stage of mental development are less frequently observed to emerge
in a fixed order with Genevan criteria than with more liberal criteria
(cf. Brainerd, 1974a; Brainerd & Hooper, 1974).

Age norms, training effects, and emergence orders all have provoked
their share of debate in the literature. However, disagreements in the
literature notwithstanding, the first two questions are not especially
crucial from the standpoint of the theory. Concerning age norms, it has been
observed elsewhere that "age norms are trivial issues from the standpoint
of Piagetian theory because the theory leaves wide latitude regarding them
[Brainerd, 1973b, p. 173]." If, for example, it could be demonstrated that
most concrete-operational concepts appear a full two years earlier than the
nominal 7-to-8-year-old norm, the theory would not be substantively affected.
Second, concerning training effects, the suggestion that such effects may be
observed with liberal criteria but not with conservative criteria would be
important only if, as was once supposed (e.g., Mermelstein & Meyer, 1969;
Smedslund, 1961), the theory specified that stage-related concepts cannot
be trained. Although there has been some confusion in the literature over
precisely what the predictions of the theory are vis-a-vis training, this
issue has been clarified in recent years (Brainerd & Allen, 1971; Inhelder
& Sinclair, 1969; Sinclair, 1973). Clearly, the theory predicts training
effects (cf. especially Sinclair, 1973, pp. 57-58). Moreover, the
conjecture that these effects are visible with liberal criteria but not with
Genevan criteria has not received much support from neoPiagetian training
experiments. In a recent review of this literature (Brainerd, 1973c), it
was noted that, to date, choice of criterion has not been a critical
variable in training experiments. The typical finding has been that a given
treatment produces a training effect with both sorts of criteria, but the
effect is more pronounced with liberal criteria.

Unlike a and b, the third question is crucial from the standpoint
of the theory. Hence, let us be precise about what this question is
concerned with. Three general types of asynchrony are possible with Piagetian
concepts. First, there are between-stage asynchronies, which the theory
terms vertical décalages. Between-stage asynchronies come in two varieties:
within-concept and between-concept. Concerning the former, we are given two
or more versions of the same general concept which ostensibly belong to
different global stages of mental development and they are observed to emerge
in a fixed order. Quantity conservation (concrete-operational stage) and
density conservation (formal-operational stage) are a case in point.
Concerning between-stage/between-concept asynchronies, we are given two or
more distinct concepts which ostensibly belong to different stages and they
are observed to emerge in a fixed order. Number conservation and implication
reasoning are a case in point. With respect to both versions of vertical
décalage, the prediction, of course, is that the item from the earlier stage
invariably is acquired before the item from the later stage. The second
type of asynchrony is within-stage/within-concept, which the theory terms
horizontal décalage. For example, children are known to conserve in some
nominally concrete-operational areas (e.g., number, length) long before they
conserve in others (e.g., weight, area). Although the theory makes no
directional predictions about this type of asynchrony, it presently
accommodates décalages on an ad hoc basis (cf. Flavell & Wohlwill, 1969).
The third and most interesting form of asynchrony is within-stage/between-
concept. Here, we are given two or more distinct concepts which ostensibly
belong to the same stage and they are observed to emerge in a fixed order.
The concrete-operational concepts of transitivity and conservation provide a
standard illustration. It is with respect to within-stage/between-concept
asynchrony that the criterion problem becomes of paramount importance.

Within-stage/between-concept asynchrony is more interesting than
either between-stage or within-stage/within-concept asynchrony because it
is generally proscribed by the theory (cf. Flavell & Wohlwill, 1969; Pinard
& Laurendeau, 1969). Moreover, existing neoPiagetian models, which seek to
refine Piaget's stages by adding quantifiable transition parameters (e.g.,
Case, 1972; Flavell & Wohlwill, 1969; Pascual-Leone, Pulos, & Parkinson,
1974; Pascual-Leone & Smith, 1969), also cannot accommodate this type of
asynchrony. It is Piaget's structures-of-the-whole principle which
precludes within-stage/between-concept asynchrony (Brainerd, 1971c; Pinard &
Laurendeau, 1969). According to this principle, the concepts which
characterize each global stage of mental development all presuppose the same
set of tightly knit cognitive structures. During the course of each stage,
the concepts identified with that stage are generated from the structures
in much the same manner that a mathematician would deduce theorems from
previously given proof procedures. Because the structures which define a
given stage are so tightly knit, the theory stipulates that they must emerge
synchronously. Within-stage/within-concept asynchronies are possible
because the theory acknowledges the somewhat animistic possibility that
certain content areas "resist" application of the structures more than other
content areas. Thus, number conservation precedes quantity conservation
because, for some unknown reason, quantity content resists application of the
concrete-operational groupement structures more than number content does.
However, when we consider two distinct concepts which are believed to
presuppose the same underlying structures (i.e., belong to the same stage)
within the same content area, the resistance argument cannot be invoked
because content is held constant. If we compare transitivity of length with
conservation of length, for example, the content area is the same and, hence,
the two concepts should emerge synchronously (Brainerd, 1974c; Dagenais,
1973; Pinard & Laurendeau, 1969). If, instead, it is observed that one of
the two concepts invariably precedes the other in most content areas, then
the claim that both concepts presuppose the same level of cognitive
structuration seems dubious.

The structures-of-the-whole principle notwithstanding, within-stage/
between-concept asynchronies may indeed occur. They have been observed in
profusion during recent years with concrete-operational concepts. Moreover,
the asynchronies which have been reported seem to suggest a general
underlying pattern which has been commented on elsewhere (Brainerd, 1974c;
Burke-Merkle & Hooper, 1975). The structures which all concrete-operational
concepts are said to presuppose are the so-called groupements. There are
eight of these structures in all. Four of them are concerned with
set-theoretic operations predicated on nested classes (cf. Piaget, 1949,
pp. 109-125) and the remaining four are concerned with arithmetic operations
predicated on asymmetrical and symmetrical relations (cf. Piaget, 1949,
pp. 141-179). Hence, concrete-operational concepts tend to fall into two broad
categories: those concerned primarily with reasoning about classes (e.g.,
class inclusion, double classification, cardinal number) and those concerned
primarily with reasoning about relations (e.g., transitivity, double seriation,
ordinal number). Suppose we think of the latter group of concepts as
comprising a "relational dimension" and the former group of concepts as
comprising a "classificatory dimension." Recent developmental evidence
suggests that, contrary to structures-of-the-whole, children make
considerable progress with the relational dimension before they make much
progress with the classificatory dimension. Taken together, findings on four
within-stage/between-concept asynchronies suggest this underlying pattern:
transitivity vs. class inclusion (Brainerd, 1973a, 1974b; Brainerd & Vanden
Heuvel, 1974); double seriation vs. double classification (Hooper et al.,
1974); ordinal number vs. cardinal number (Brainerd, 1973d, 1973e, 1974d;
Brainerd & Fraser, 1975; Gonchar, 1974; Siegel, 1974c); relational
groupement operations vs. classificatory groupement operations (Brainerd, 1972;
bihoff, 1974; Weinreb & Brainerd, 1975). In each instance, the relevant
relational concepts were observed to emerge before the relevant
classificatory concepts.

Although the data reported in the studies just mentioned are both
substantial and suggestive of an underlying pattern, liberal nonGenevan
criteria were employed in all the studies. Therefore, it is possible for
more orthodox investigators to invoke the Type I error criticism discussed
earlier. Further doubts are raised by empirical evidence which suggests
that some of the preceding asynchronies are not observed with conservative
Genevan criteria (Dagenais, 1973; Gonchar, 1974; Lemerise, 1974; Sheppard,
1974; Smedslund, 1964). Moreover, Genevan criteria are known to reduce
the possibility of finding asynchronies in same-stage concepts other than
those mentioned above (Brainerd & Brainerd, 1972; Brainerd, 1974a;
Brainerd & Hooper, 1974).

To summarize, we are confronted with two general problems. First,
there is the debate over response criteria. Genevan-oriented investigators
have argued that Type I errors are committed with liberal criteria and less
orthodox investigators have argued that Type II errors are committed with
conservative criteria. The second problem concerns the conflicting findings
on within-stage/between-concept asynchrony. Recent evidence suggests that
asynchronies of this theoretically crucial sort are observed more frequently
with liberal criteria than with conservative criteria. Therefore, we cannot
decide which findings are more believable or, vastly more important, how to
proceed in future studies until we know precisely what effects Type I and
Type II errors have on the null hypotheses tested in such studies. Is it
possible, as Genevan reasoning implies, that Type I errors can manufacture
asynchrony from synchrony? Is it possible, as the reasoning of other
investigators implies, that Type II errors can mask real asynchrony? If
one of these questions can be answered negatively and the other answered
affirmatively, then we shall have a resolution of the criterion problem for
studies concerned with within-stage/between-concept asynchrony.

Effects of Type I and Type II Errors on Different Populations
We must separately evaluate the effects of Type I and Type II errors
on two different populations: theory-supportive populations (i.e., those in
which same-stage concepts emerge synchronously) and theory-contradictory
populations (i.e., those in which same-stage concepts emerge asynchronously).
With theory-supportive populations, we are especially concerned to know
whether or not Type I error increases the probability of obtaining
asynchronous data. With theory-contradictory populations, we are especially
concerned to know whether or not Type II error increases the probability
of obtaining synchronous data.

The structure of a concept development study may be described as
follows. We are given two (or more) concepts, A and B, which are known
to emerge during some age range R. Tests of A and B are administered to
a sample from R. The subjects' responses are scored according to
a conservative or liberal criterion and the scores are used to assign each
subject to one of four categories: pass A-pass B (A/B), pass A-fail B
(A/~B), fail A-pass B (~A/B), and fail A-fail B (~A/~B). The appropriate
null hypothesis is that the observed frequencies of the second and third
categories do not differ significantly. An exact significance test for this
hypothesis is given by

$$P(x) = \binom{N}{x} P^{x} Q^{N-x}, \qquad [1]$$

where x = either the number of A/~B subjects or the number of ~A/B subjects,
N = the total number of A/~B subjects and ~A/B subjects, P = .50, and Q = .50.
For three or more concepts, Eq. 1 provides an exact significance test for
each of the pairwise null hypotheses. Approximate tests of significance
for three or more concepts are provided by less cumbersome scalometric
procedures (e.g., Green, 19b4). The important fact to bear in mind is that
subjects classified as A/B or ~A/~B are irrelevant to the null hypothesis and,
hence, we must concern ourselves with the effects of Type I and Type II
criterion errors on the relative observed frequencies of categories A/~B
and ~A/B.
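
As a hedged illustration of Eq. 1, the following Python sketch computes the
point probability and a two-sided exact probability for the two discordant
categories. The function names and the rule of doubling the more extreme tail
are assumptions introduced here for illustration; the text itself specifies
only the binomial form with P = Q = .50.

```python
from math import comb


def binomial_point_probability(x: int, n: int, p: float = 0.5) -> float:
    """Eq. 1: probability of exactly x A/~B subjects among n discordant subjects."""
    q = 1.0 - p
    return comb(n, x) * p**x * q ** (n - x)


def exact_two_sided_probability(x: int, n: int) -> float:
    """Two-sided exact probability under P = Q = .50.

    Assumed tail rule (not stated in the text): sum the tail at least as
    extreme as the larger discordant count, then double it, capped at 1.
    """
    k = max(x, n - x)
    tail = sum(binomial_point_probability(i, n) for i in range(k, n + 1))
    return min(1.0, 2.0 * tail)


# Hypothetical data: 9 A/~B subjects and 1 ~A/B subject (N = 10).
print(exact_two_sided_probability(9, 10))
```

With these hypothetical counts the two-sided probability is about .021, so the
null hypothesis of equal discordant frequencies would be rejected at the .05
level. Subjects classified A/B or ~A/~B never enter the computation.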

Theory-Supportive Population: Null Hypothesis True

Assume that the tests are administered to a sample drawn from a
population in which A and B emerge synchronously. The population frequencies
for the four possible types of subjects are: P(A/B) = $P_1$, P(A/~B) = $P_2$,
P(~A/B) = $P_2$, and P(~A/~B) = $P_3$, where $P_1 + 2P_2 + P_3 = 1$. Because A
and B are not acquired in any particular order, it must be the case that the
population frequencies of the A/~B and ~A/B categories are equal. Suppose that
the responses of our subjects are scored according to a liberal criterion with
a Type I error rate of $0 < x < 1$. How will this affect the expected
frequencies of the four subject categories?
Concerning category A/B, the probability that any subject who belongs
to this category will be correctly classified by our criterion is unity. For
category A/~B, the probability that any subject who belongs to this category
will be correctly classified by our criterion is $1 - x$. The probability
that an A/~B subject will be incorrectly assigned to the A/B category is $x$.
For category ~A/B, the probability that any subject who belongs to this
category will be correctly classified by our criterion is $1 - x$. The
probability that a ~A/B subject will be incorrectly assigned to the A/B
category is $x$. For category ~A/~B, the probability that any subject who
belongs to this category will be correctly classified by our criterion is
$1 - 3x^2$. The probability that a ~A/~B subject will be incorrectly assigned
to categories A/B, A/~B, and ~A/B, respectively, is $x^2$ in each case. Hence,
the expected frequencies of the four categories for any sample drawn from
this population are:

$$E(A/B) = P_1 + 2xP_2 + x^2 P_3, \qquad [2]$$
$$E(A/\sim B) = P_2[1 - x] + x^2 P_3, \qquad [3]$$
$$E(\sim A/B) = P_2[1 - x] + x^2 P_3, \text{ and} \qquad [4]$$
$$E(\sim A/\sim B) = P_3[1 - 3x^2]. \qquad [5]$$

Note that if $P_1 + 2P_2 + P_3 = 1$, then Eq. 2-Eq. 5 must sum to 1 also.
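
The following Python sketch evaluates Eqs. 2-5 for one hypothetical
theory-supportive population. The function name and the population values are
illustrative assumptions, not figures from any study; the sketch merely checks
that the four expected frequencies sum to 1 and that the two discordant
categories remain equal.

```python
def expected_type1_supportive(p1: float, p2: float, p3: float, x: float):
    """Expected category frequencies under Eqs. 2-5 (synchronous population)."""
    e_ab = p1 + 2 * x * p2 + x**2 * p3      # Eq. 2: A/B
    e_a_notb = p2 * (1 - x) + x**2 * p3     # Eq. 3: A/~B
    e_nota_b = p2 * (1 - x) + x**2 * p3     # Eq. 4: ~A/B
    e_nota_notb = p3 * (1 - 3 * x**2)       # Eq. 5: ~A/~B
    return e_ab, e_a_notb, e_nota_b, e_nota_notb


# Hypothetical population: P1 = .30, P2 = .20, P3 = .30, so P1 + 2*P2 + P3 = 1.
freqs = expected_type1_supportive(0.30, 0.20, 0.30, x=0.10)
print(freqs)
print(sum(freqs))            # should be 1 up to rounding error
print(freqs[1] == freqs[2])  # the discordant categories stay equal
```

The model only yields a proper probability for the ~A/~B category when
$3x^2 \le 1$, so the sketch should be used with error rates in that range.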

The principal effect of Type I errors with theory-supportive samples
is on the first and last categories. The expected value for the first
category is substantially increased and the expected value for the last
category is substantially decreased, relative to their population values.
As mentioned earlier, however, we are concerned primarily with the effects
of criterion errors on the second and third categories. Note that, from the
standpoint of the null hypothesis tested in concept development studies,
Type I errors are of no consequence. Although Type I errors shift the
expected values of both A/~B and ~A/B relative to their population values
(a net decrease whenever $xP_2 > x^2P_3$), both are shifted by the same
amount. Therefore, if the population frequencies of the two categories are
equal, as they must be if the population is theory-supportive, then their
expected frequencies after Type I error also are equal. Because the null
hypothesis is concerned with the difference between the observed frequencies
for the two categories, Type I criterion errors neither increase nor decrease
the probability of obtaining asynchronous data given synchrony in the
population.

Now, suppose that our subjects' responses are scored according to a
conservative criterion with a Type II error rate of $0 < y < 1$. The effect
is essentially the reverse of the above. Again, assume that the population
frequencies of the four categories are $P_1$, $P_2$, $P_2$, and $P_3$,
respectively. For category A/B, the probability that any subject who belongs
to this category will be correctly classified by our criterion is $1 - 3y^2$.
The probability that an A/B subject will be incorrectly assigned to categories
A/~B, ~A/B, and ~A/~B is $y^2$ in each case. For category A/~B, the probability
that any subject who belongs to this category will be correctly classified
by our criterion is $1 - y$. The probability that an A/~B subject will be
incorrectly assigned to the ~A/~B category is $y$. For category ~A/B, the
probability that any subject who belongs to this category will be correctly
classified by our criterion is $1 - y$. The probability that a ~A/B subject
will be incorrectly assigned to the ~A/~B category is $y$. Finally, for
category ~A/~B, the probability that any subject who belongs to this category
will be correctly classified by our criterion is unity. Hence, the
expected frequencies for the four categories are

$$E(A/B) = P_1[1 - 3y^2], \qquad [6]$$
$$E(A/\sim B) = P_2[1 - y] + y^2 P_1, \qquad [7]$$
$$E(\sim A/B) = P_2[1 - y] + y^2 P_1, \text{ and} \qquad [8]$$
$$E(\sim A/\sim B) = P_3 + 2yP_2 + y^2 P_1. \qquad [9]$$



As was the case for Type I errors, the principal effect of Type II
errors, given a theory-supportive sample, is on the first and last
categories: here the ~A/~B category is inflated substantially and the A/B
category is deflated substantially. As was also the case for Type I errors,
Type II errors do not differentially affect the critical A/~B and ~A/B
categories. If the frequencies of these two categories are equal in the
population, then their expected frequencies after Type II error also must be
equal. Therefore, Type II errors neither increase nor decrease the
probability of obtaining asynchronous data given synchrony in the population.
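
A brief Python sketch of the Type II case for a theory-supportive population
follows. It assumes the misclassification model described above (an A/B
subject is moved to each other category with probability $y^2$; an A/~B or
~A/B subject is moved to ~A/~B with probability $y$); the function name and
population values are hypothetical. It sweeps several error rates to show that
the difference between the discordant categories stays at zero.

```python
def expected_type2_supportive(p1: float, p2: float, p3: float, y: float):
    """Expected category frequencies under Type II error, synchronous population."""
    e_ab = p1 * (1 - 3 * y**2)                  # A/B
    e_a_notb = p2 * (1 - y) + y**2 * p1         # A/~B
    e_nota_b = p2 * (1 - y) + y**2 * p1         # ~A/B
    e_nota_notb = p3 + 2 * y * p2 + y**2 * p1   # ~A/~B
    return e_ab, e_a_notb, e_nota_b, e_nota_notb


# Hypothetical population: P1 = .30, P2 = .20, P3 = .30.
for y in (0.05, 0.10, 0.25, 0.50):
    e_ab, e_a_notb, e_nota_b, e_nota_notb = expected_type2_supportive(
        0.30, 0.20, 0.30, y
    )
    total = e_ab + e_a_notb + e_nota_b + e_nota_notb
    # The discordant difference is zero at every error rate.
    print(y, e_a_notb - e_nota_b, round(total, 12))
```

However large the error rate, the expected difference that the exact test
examines is unchanged; only the concordant categories A/B and ~A/~B trade
frequency.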

Theory-Contradictory Population: Null Hypothesis False

Assume that the tests are administered to a sample from a population
in which A and B emerge in a fixed order and assume that the order is A > B
(i.e., A is acquired before B). The population frequencies for the four
possible types of subjects are: P(A/B) = $P_1$, P(A/~B) = $P_2$, P(~A/B) = 0,
and P(~A/~B) = $P_3$, where $P_1 + P_2 + P_3 = 1$. Suppose that the responses
of our subjects are scored according to the same liberal criterion as before.
The probability that any A/B subject will be correctly classified is unity.
The probability that any A/~B subject will be correctly classified is $1 - x$
and the probability that he will be incorrectly classified as an A/B is $x$.
The probability that any ~A/~B subject will be correctly classified is
$1 - 3x^2$ and the probability that he will be incorrectly classified as an
A/B or an A/~B or a ~A/B is $x^2$ in each case. Hence, the expected
frequencies for the four categories are:



$$E(A/B) = P_1 + xP_2 + x^2 P_3, \qquad [10]$$
$$E(A/\sim B) = P_2[1 - x] + x^2 P_3, \qquad [11]$$
$$E(\sim A/B) = x^2 P_3, \text{ and} \qquad [12]$$
$$E(\sim A/\sim B) = P_3[1 - 3x^2]. \qquad [13]$$



Unlike theory-supportive samples, Type I errors do have a differential
effect on the crucial A/~B and ~A/B categories with theory-contradictory
samples. Explicitly, Type I errors tend to increase the observed
frequency of ~A/B relative to the observed frequency of A/~B: Given that
$P_2$, $P_3$, and $x$ all lie between 0 and 1, the difference between the
population frequencies of the two categories ($P_2$ and 0) must be greater
than the difference between the expected frequencies of the two categories
after Type I error ($P_2[1 - x] + x^2P_3$ and $x^2P_3$). Therefore, the
commission of Type I errors will increase the probability of falsely accepting
the null hypothesis that A/~B and ~A/B subjects occur with equal frequency in
the population. Given a constant error rate, this masking effect will become
more serious as the absolute magnitude of $P_2$ decreases. If A > B is a
"robust" sequence (i.e., several years elapse between the onset of A and the
first evidence of B), then the frequency of A/~B subjects in the population
will be substantial relative to the frequencies of A/B and ~A/~B subjects.
If A > B is a "precise" sequence (i.e., only a few months elapse between the
onset of A and the first evidence of B), then the frequency of A/~B subjects
in the population will be small relative to the frequencies of A/B and ~A/~B
subjects. Hence, the consequences of making Type I errors will be more
serious for precise sequences such as identity vs. equivalence (cf. Brainerd
& Hooper, 1974) and multiple classification vs. multiple seriation (cf.
Brainerd, 1974c) than for robust sequences such as transitivity vs. class
inclusion (cf. Brainerd, 1973a).
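
The contrast between robust and precise sequences can be sketched numerically.
The Python example below assumes the Type I misclassification model described
above for a population in which A precedes B; Eq. 10 is taken here as
$E(A/B) = P_1 + xP_2 + x^2P_3$, which follows from the stated misclassification
probabilities. The function names, the two populations, and the error rate are
hypothetical values chosen only for illustration.

```python
def expected_type1_contradictory(p1: float, p2: float, p3: float, x: float):
    """Expected frequencies under Type I error when A precedes B (P(~A/B) = 0)."""
    e_ab = p1 + x * p2 + x**2 * p3          # A/B
    e_a_notb = p2 * (1 - x) + x**2 * p3     # A/~B
    e_nota_b = x**2 * p3                    # ~A/B (zero in the population)
    e_nota_notb = p3 * (1 - 3 * x**2)       # ~A/~B
    return e_ab, e_a_notb, e_nota_b, e_nota_notb


def discordant_share(e_a_notb: float, e_nota_b: float) -> float:
    """Share of A/~B among discordant subjects; .50 is the null value."""
    return e_a_notb / (e_a_notb + e_nota_b)


x = 0.20  # hypothetical Type I error rate
robust = expected_type1_contradictory(0.20, 0.60, 0.20, x)   # large P2
precise = expected_type1_contradictory(0.45, 0.10, 0.45, x)  # small P2

for label, freqs in (("robust", robust), ("precise", precise)):
    gap = freqs[1] - freqs[2]  # equals P2 * (1 - x), smaller than P2
    print(label, round(gap, 4), round(discordant_share(freqs[1], freqs[2]), 4))
```

Without error the share of A/~B among discordant subjects would be 1.0 in both
populations. With the same error rate, the share falls further toward the null
value of .50 in the precise population than in the robust one, which is the
sense in which the masking effect grows as $P_2$ decreases.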

Now, suppose that we employ the conservative criterion mentioned
earlier. The population frequencies of the four categories again are
assumed to be $P_1$, $P_2$, 0, and $P_3$, respectively. The probability that
any A/B subject will be correctly classified is $1 - 3y^2$ and the probability
that he will be incorrectly classified as an A/~B or a ~A/B or a ~A/~B is $y^2$
in each instance. The probability that any A/~B subject will be correctly
classified is $1 - y$ and the probability that he will be incorrectly
classified as a ~A/~B is $y$. The probability that any ~A/~B subject will be
correctly classified is unity. Hence, the expected frequencies for the four
categories are:



$$E(A/B) = P_1[1 - 3y^2], \qquad [14]$$
$$E(A/\sim B) = P_2[1 - y] + y^2 P_1, \qquad [15]$$
$$E(\sim A/B) = y^2 P_1, \text{ and} \qquad [16]$$
$$E(\sim A/\sim B) = P_3 + yP_2 + y^2 P_1. \qquad [17]$$

Thus, the effect of Type II errors on theory-contradictory samples is
precisely the same as the effect of Type I errors. Type II errors increase
the observed frequency of ~A/B subjects relative to the observed frequency
of A/~B subjects and, therefore, the probability of falsely accepting the
null hypothesis that A/~B and ~A/B subjects occur with equal frequency in
the population also increases. As was also the case for Type I errors, the
practical consequences of committing Type II errors increase as $P_2$
decreases. Given a constant Type II error rate, its consequences will be more
serious for precise sequences than for robust ones.
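
To make the masking effect concrete, the Python sketch below converts the
expected discordant frequencies under Type II error into the probability that
a two-sided exact binomial test rejects the null hypothesis. It rests on
several assumptions not found in the text: the number of discordant subjects is
treated as fixed, the tail-doubling rule and the .05 level are arbitrary
choices, and the function names and population values are hypothetical.

```python
from math import comb


def discordant_share_type2(p1: float, p2: float, y: float) -> float:
    """Expected share of A/~B among discordant subjects when A precedes B."""
    e_a_notb = p2 * (1 - y) + y**2 * p1   # A/~B after Type II error
    e_nota_b = y**2 * p1                  # ~A/B after Type II error
    return e_a_notb / (e_a_notb + e_nota_b)


def rejection_probability(share: float, n: int, alpha: float = 0.05) -> float:
    """Chance that a two-sided exact test (P = Q = .50) rejects, n held fixed."""

    def pmf(k: int, p: float) -> float:
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    def p_value(k: int) -> float:
        tail = sum(pmf(i, 0.5) for i in range(max(k, n - k), n + 1))
        return min(1.0, 2.0 * tail)

    return sum(pmf(k, share) for k in range(n + 1) if p_value(k) <= alpha)


# Hypothetical precise sequence (P1 = .45, P2 = .10), 20 discordant subjects.
for y in (0.0, 0.1, 0.3):
    share = discordant_share_type2(0.45, 0.10, y)
    print(y, round(share, 4), round(rejection_probability(share, 20), 4))
```

As the Type II error rate rises, the discordant share drifts toward .50 and
the chance of detecting the real A > B sequence falls, in line with the
argument that such errors can only mask asynchrony.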
Conclusions and Recommendations
The preceding analysis has several interesting implications for the
criterion problem with which we began. First, the claim that liberal criteria
which incorporate Type I error tend to produce findings which are overly
unfavorable to the theory obviously is false. Given that a population is
theory-supportive (null hypothesis true), Type I errors neither increase
nor decrease the probability of incorrectly rejecting the null hypothesis.
Moreover, given that a population is theory-contradictory (null hypothesis
false), Type I errors actually reduce the probability of correctly rejecting
the null hypothesis. In short, regardless of whether or not liberal
criteria do in fact incorporate Type I errors, the use of such criteria
cannot possibly militate unfairly against Piagetian predictions. Quite
to the contrary, if it happens that Type I errors are committed with a
certain criterion, then the chances of finding spurious support for the
theory are enhanced. A second implication of the preceding analysis is that
the claim that conservative criteria tend to produce findings which unfairly
favor the theory obviously is true. Given that a population is
theory-contradictory, Type II errors increase the probability that the null
hypothesis will be incorrectly accepted. As is the case for Type I errors,
therefore, Type II errors increase the chances of obtaining spurious
support for the theory.

There is one very positive outcome of the preceding analysis which
should not go unmentioned. The analysis indicates that there is no
psychometric foundation for the somewhat nihilistic view that obtaining or not
obtaining theory-supportive findings in concept development studies is
largely a function of arbitrary decisions about response criteria. This
view, which has been expressed with distressing regularity in recent years
(e.g., Hooper, 1974; Kuhn, 1974), undoubtedly has been a source of
considerable uncertainty to concept development researchers. A rapprochement
now may be effected. Psychometrically speaking, it is impossible to manufacture
spuriously asynchronous data via the commission of either Type I or Type II
criterion errors. Hence, evidence of asynchrony which presently exists in
the literature cannot be dismissed on the grounds that the response criteria
employed in the relevant studies incorporated either Type I or Type II errors,
although it is always possible that this evidence could be dismissed on some
other grounds. On the other hand, it is possible to manufacture spuriously
synchronous data via the commission of Type I and Type II errors. Therefore,
we must be more circumspect about accepting evidence of synchrony than we
are about accepting evidence of asynchrony.

An important and provocative implication of what has just been said
is that, from the standpoint of the criterion debate, existing studies in
which asynchronies have been observed are intrinsically more believable than
existing studies in which synchronies have been observed. Earlier, it was
noted that substantial evidence of theory-contradictory asynchrony exists
for concrete-operational concepts such as transitivity, class inclusion,
multiple seriation, multiple classification, ordinal number, cardinal number,
and groupement operations using liberal criteria. It also was noted that some
of these same asynchronies have not been observed in studies where
conservative criteria were employed. We now know that Type I criterion
errors cannot possibly explain findings of the former sort. If Type I
errors were made in these studies, then the reported asynchronies appeared
in spite of them rather than because of them. We also know that Type II
errors can explain findings of the latter sort, at least in part. In view
of the fact that Type I and Type II errors both tend to produce spuriously
synchronous data, it is psychometrically reasonable to suppose that the
effective Type I error rates of the liberal criteria employed in the first
group of studies must be substantially nearer to zero than the effective
Type II error rates of the conservative criteria employed in the second
group of studies.